Summarizing Blog Entries versus News Texts

Authors

  • Shamima Mithun
  • Leila Kosseim
Abstract

As more and more people express their opinions on the web in the form of weblogs (or blogs), research on the blogosphere is gaining popularity. As an outcome of this research, different natural language tools such as query-based opinion summarizers have been developed to mine and organize opinions on a particular event or entity in blog entries. However, the variety of blog posts and the informal style and structure of blog entries pose many difficulties for these natural language tools. In this paper, we identify and categorize errors which typically occur in opinion summarization from blog entries and compare blog entry summaries with traditional news text summaries based on these error types to quantify the differences between these two genres of texts for the purpose of summarization. For evaluation, we used summaries from participating systems of the TAC 2008 opinion summarization track and updated summarization track. Our results show that some errors are much more frequent in blog entries (e.g., topic-irrelevant information) than in news texts, while other error types, such as content overlap, seem to be comparable. These findings can be used to prioritize these error types and give clear indications as to where we should put effort to improve blog...

Similar resources

Clustering blog entries based on the hybrid document model enhanced by the extended anchor texts and co-referencing links

In this paper, we propose a document vector space model where weights of noun terms vary depending on positions within the texts of blog entries as search results. We extend “extended anchor texts” (i.e., extra texts surrounding anchor texts) with the exponential potential such that the weight of a noun term decreases exponentially as the distance between the term and link increases. In order t...

Automatically Linking News Articles to Blog Entries

People often write in their blogs about news articles or events in news articles. In this case, however, the details of the news articles or events are often poorly described in such blog entries. Therefore, the readers of blogs need to find the original articles, which contain more details of the news articles, when they want to know about them. In this paper, we propose a method for linking n...

Linking Topics of News and Blogs with Wikipedia for Complementary Navigation

We study complementary navigation of news and blogs, where Wikipedia entries are utilized as a fundamental knowledge source for linking news articles and blog feeds/posts. In the proposed framework, given a topic as the title of a Wikipedia entry, its Wikipedia entry body text is analyzed as a fundamental knowledge source for the given topic, and terms strongly related to the given topic are extract...

Contrasting Objective and Subjective Portuguese Texts from Heterogeneous Sources

This paper contrasts the content and form of objective versus subjective texts. A collection of online newspaper news items serves as objective texts, while parliamentary speeches (debates) and blog posts form the basis of our subjective texts, all in Portuguese. The aim is to provide general linguistic patterns as used in objective written media and subjective speeches and blog posts, to help ...

Algorithm and Implementation of the Blog-Post Supervision Process

A web log, or blog for short, is a trendy way to share personal entries with others through a website. A typical blog may consist of text, images, audio, video, etc. Most blogs work as personal online diaries, while others may focus on a specific interest such as photographs (photoblog), art (artblog), travel (tourblog), IT (techblog), etc. Another type of blogging called microblogging is a...

Publication date: 2009